Machine Learning for Cybersecurity (MLC)

Keynote Speakers

Title: Data-Centric Evaluation of Large Code Models
Qiang Hu, Tianjin University
Abstract: Evaluation is a key step in the development lifecycle of large code models and an active research direction. Data, as a fundamental element of artificial intelligence, plays a crucial role in the evaluation of large code models. This talk focuses on data-centric techniques and best practices for large code model evaluation. Specifically, I will introduce our recent research on building high-quality code evaluation benchmark datasets and automating the acquisition of important evaluation data, and explore the impact of evaluation data quality on large code model evaluation results. These works have been published in ASE 2025 and TOSEM.
Bio: Dr. Hu is an Associate Professor at Tianjin University. Previously, he was a Postdoctoral Researcher at the University of Tokyo, working with Prof. Lei Ma. He received his Ph.D. degree from the University of Luxembourg, advised by Prof. Yves Le Traon, Prof. Mike Papadakis, and Prof. Maxime Cordy. Before that, he received his Master’s degree from Kyushu University, advised by Prof. Jianjun Zhao.
Title: Toward Intelligent and Automated Binary Security Analysis with LLMs
Yaowen Zheng, Chinese Academy of Sciences
Abstract: The rapid development of large language models (LLMs) is transforming the field of software and binary security analysis. With their assistance, the entire vulnerability analysis process has become more accurate, intelligent, and automated. Traditional approaches rely heavily on manually crafted rules and expert-defined features, which limits their adaptability to diverse architectures and complex binary environments. Motivated by the strong code understanding and reasoning capabilities of LLMs, we have explored their application across multiple binary security analysis tasks and achieved encouraging results. In this keynote, I will discuss recent advances in applying LLMs to binary analysis and present two of our representative research works, LLM for taint analysis (LATTE) and patch presence testing (Lares), to demonstrate how LLMs can advance the automation and intelligence of real-world binary security analysis.
Bio: Dr. Zheng is a Professor at the Institute of Information Engineering, Chinese Academy of Sciences. Previously, he was a Research Fellow at Nanyang Technological University under the supervision of Professor Yang Liu. He received his Ph.D. in Cyberspace Security from the University of Chinese Academy of Sciences under the supervision of Professor Limin Sun, and also conducted research at the University of California, Riverside, with Professor Heng Yin. His research focuses on system security, particularly on vulnerability analysis techniques for IoT systems and firmware, encompassing both static analysis methods such as taint analysis and dynamic techniques including emulation-based fuzzing. More recently, his interests have expanded to exploring LLM-empowered security and embodied AI security.
Title: Machine Learning for Cellular Network Security: Challenges, Tools, and Opportunities
Imtiaz Karim, The University of Texas at Dallas
Abstract: Cellular networks are the bedrock of modern communication. The recent deployment of 5G has generated further enthusiasm and opportunities in both academia and industry. Therefore, the security of cellular networks is critical. In this talk, I will elaborate on the essential challenges of ensuring cellular network security and move on to my research on using ML to enhance the resilience of the networks. I will begin by discussing the analysis of 4G/5G specifications and introducing CellularLint, which uses a revamped few-shot learning mechanism on domain-adapted Large Language Models (LLMs) to detect inconsistencies in 4G and 5G specifications. Then, I will discuss an ML-based defensive approach, termed FBSDetector, which is devised to detect and defend against threats such as Fake Base Stations and multi-step attacks. I will conclude by outlining some of the challenges and opportunities of using ML to ensure the security and privacy of a highly specialized domain, such as 5G and NextG protocols and systems.
Bio: Dr. Karim is an Assistant Professor in the Department of Computer Science at the University of Texas at Dallas. Previously, he was a Postdoctoral Researcher in the Department of Computer Science at Purdue University. He completed his Ph.D. in the same department in Spring 2023. He leads the System and Network Security (SysNetS) lab at UTD. His research lies in the general area of systems and network security. More specifically, his focus is on ensuring the security and privacy of wireless communication protocols (e.g., 4G/5G cellular networks, Bluetooth, VoWiFi, vehicular, WiFi, and IoT) with respect to their design and implementation. His research aims to develop tools that systematically analyze real-world systems and widely used protocols using AI (ML and NLP), formal verification, program analysis, and software testing techniques. Furthermore, with the advent of the next generation of networks (6G and beyond), his future goal is to ensure the resilience (reliability, adaptability, and security) of future network generations and to develop protocols and systems that are robust and secure by design.

Call for Papers

In the past decades, cybersecurity threats have been among the most significant challenges for social development, resulting in financial loss, violation of privacy, damage to infrastructure, etc. Organizations, governments, and cyber practitioners tend to leverage state-of-the-art Artificial Intelligence technologies to analyze, prevent, and protect their data and services against cyber threats and attacks. Due to the complexity and heterogeneity of security systems, cybersecurity researchers and practitioners have shown increasing interest in applying data mining methods to mitigate cyber risks in many security areas, such as malware detection and key player identification in underground forums. To protect the cyber world, we need more effective and efficient algorithms and tools capable of automatically and intelligently analyzing and classifying the massive amount of data in complex cybersecurity scenarios. This workshop will focus on empirical findings, methodological papers, and theoretical and conceptual insights related to data mining in the field of cybersecurity.

The workshop aims to bring together researchers from cybersecurity, data mining, and machine learning domains. We encourage a lively exchange of ideas and perceptions through the workshop, focused on cybersecurity and data mining. Topics of interest include, but are not limited to:

Security and privacy of data mining
Data mining and AI applications for cybersecurity
Data mining approaches to enhance security and resiliency of cyber systems
Human behavior models with application to cybersecurity
Data mining for cybersecurity software verification and validation
AI tools and techniques to enhance resilience in cybersecurity
AI-enabled automation of cybersecurity tasks
Data-driven cybersecurity analytics
Uncertainty-based decision making in cybersecurity
Modeling and simulation of cyber systems and system components

We are interested in the new applications of data mining and AI for cybersecurity. Submitted papers will be evaluated based on criteria such as technical originality, creativity, and applicability. Methodological topics of interest include, but are not limited to:

Adversarial machine learning
Cryptography
Large language models
Interpretable deep learning
Multi-view deep learning paradigms
Real-time analytics and deep learning on stream data
Deep transfer learning
Deep reinforcement learning
Bayesian deep learning
Graph convolutional networks and graph attention networks

Application areas of interest include, but are not limited to:

Malware evasion and detection
Cyber threat detection and modeling
Intrusion detection
Internet of Things (IoT) analysis
AI-enabled security information sharing and automation across multiple data sources
Smart and large-scale vulnerability assessment
Intelligent security systems with human-in-the-loop
Vulnerability detection
Anomaly detection
Dark web analytics for cyber threat intelligence (CTI) applications
IP reputation analysis

Important Dates

Paper Submission Deadline: September 5, 2025
Acceptance Notification: September 18, 2025
Camera-ready Submission: September 25, 2025
Workshop Date: November 14, 2025 (Massachusetts Room)

Paper Submission

Following ICDM tradition, all accepted workshop papers will be published in the ICDMW proceedings by the IEEE Computer Society Press, will be accessible in the IEEE Computer Society Digital Library (CSDL) and IEEE Xplore, and will be indexed by EI.
Submission Format: Paper submissions should be limited to a maximum of 8 pages plus 2 extra pages (for references, appendix, etc.) and follow the IEEE ICDM format. All submissions will be triple-blind reviewed by the Program Committee based on technical quality, relevance to the scope of the conference, originality, significance, and clarity. The following sections give further information for authors. Please refer to the ICDM 2024 call for papers.
Submission website: Please submit your papers via the Submission Website.
Presentation Format: Physical attendance is NOT mandatory this year due to potential visa issues. Remote presentations through Zoom are allowed. Meeting coordinates will be announced later to authors of accepted papers.
Registration is mandatory for accepted papers.

Organizers

Steering Chairs


Ali Babar, University of Adelaide
Battista Biggio, University of Cagliari
Elisa Bertino, Purdue University
Hsinchun Chen, University of Arizona
Yang Liu, Nanyang Technological University
Xinming (Simon) Ou, University of South Florida